EPJ manuscript No. 

(will be inserted by the editor) 



A remarkable emergent property of spontaneous (amino acid 
content) symmetry breaking 






R.A. Broglia 1,2,3,4

1 Department of Physics, University of Milan, via Celoria 16, I-20133 Milan, Italy

2 INFN, Milan Section, Italy

3 The Niels Bohr Institute, University of Copenhagen, Blegdamsvej 17, DK-2100 Copenhagen, Denmark

4 Foldless S.r.l., Via Valosa di Sopra 9, I-20050 Monza (MB), Italy

the date of receipt and acceptance should be inserted later 






Abstract. Learning how proteins fold will hardly have any impact on the way conventional (active-site centered) drugs are designed. On the other hand, this knowledge is proving instrumental in defining a new paradigm for the identification of drugs against any target protein: folding inhibition. Targeting folding renders drugs less prone to elicit spontaneous genetic mutations which in many cases, notably in connection with viruses like the Human Immunodeficiency Virus (HIV), can block therapeutic action. From the progress made in recent years in understanding the becoming of a protein, and in learning how to read from its sequence the associated three-dimensional, biologically active, native structure, the idea of non-conventional (folding) inhibitors, and thus of leads to eventual drugs able to fight disease, arguably without creating resistance, emerges as a distinct possibility.

PACS. XX.XX.XX No PACS code given 



To the questionQ] "what is life?" one is forced to an- 
swer that life is not one but two things [5]. Which ones ? 
Replication and metabolism. The molecules of DNA and RNA are responsible for the first function [3,4], proteins for the second [5]. Because software (replication) is necessarily a parasite of hardware (proteins), the becoming of a protein carries, to a large extent, the secret of life [6].

A possible scenario for this becoming suggests that, starting from random polypeptide chains (i.e. chains where the probability that a site is occupied by a given amino acid is 1/20) containing some tens of amino acids (Fig. 1(a)), evolution ran through a large number of all possible sequences until it clicked on a class of them containing few (4-6) strongly hydrophobic ("hot") amino acids^a which induced the formation and provided the varied stability of Local Elementary Structures (LES) [H1[I1IIH] (Fig. 1(b)), which flicker in and out of the native conformation [20] (see also App. A). The segment of the polypeptide chain associated with a LES contains approximately 10-15 amino acids. This is in keeping with the fact that segments of this length are able to fold in milliseconds, consistent with the fact that proteins containing about one hundred amino acids fold in times of the order of tens of milliseconds [21].

Strongly hydrophobic, highly conserved (hot) amino acids [22,23,24] inducing local structuring of the protein (see App. B) are responsible to a large extent for the selective interaction (molecular recognition) between a small group (2-4) of complementary [25] (in the sense of left and right hands) LES (one or two per chain). When this group of LES docks, it gives rise to the (postcritical) folding nucleus [Sfj] (FN)_pc (Fig. 1(c)), which inevitably grows into a folded oligomer, that is, a unique three-dimensional structure with incipient specific biological functions, e.g. some amount of enzymatic activity [20]. Polymerization (Fig. 1(d)) gives rise to typical globular proteins (or folding domains), containing 100-120 amino acids [27] (Fig. 1(e)) (see App. C).

^a In keeping with the fact that there are 20 different types of amino acids, one can associate with each site of the protein a quasispin (see e.g. [7]) of value 19/2, interpreting each projection as a given amino acid realization. In a random polymer, each projection is equally probable for any site, and no alignment is observed. Lowering the evolutionary temperature [H]|S] one finds that there exists a critical temperature (equivalent to the Curie temperature in the case of a ferromagnet, and which can be simply calculated making use of the random energy model [10] (E_c → T_c), see also [ITJ and refs. therein) below which one observes quasispin alignment at specific sites (hot sites), an example of which is provided by the residue occupying site 33 (Leucine (Leu)) of the HIV-1-PR, a strongly hydrophobic, highly conserved amino acid which plays the role of hub [TSinS] (see also PJ] Fig. 39 (b) and Table VI) in the native conformation of this enzyme. Assuming that the high quasispin projections correspond to strongly hydrophobic amino acids, one essentially finds an alignment pointing along the (positive) quantization axis. That is, there is a privileged orientation in quasispin space, and thus an associated spontaneous breaking of amino acid content symmetry, the symmetry associated with the invariance of the Hamiltonian describing the interaction among amino acids with respect to amino acid occupancy of the different sites of the protein. This is similar to the spontaneous breaking of rotational symmetry associated with the ferromagnetic state below the Curie temperature, a symmetry respected by the original spin-spin Hamiltonian (see e.g. [15], [16] and PH).

The setting in place, by evolution, of hot amino acids [TT], that is the occupation of certain sites of the polypeptide chain by a single type of amino acid with probability close to 1 (conservation), can be viewed as a spontaneous breaking of amino acid content symmetry (second order-like phase transition [TB] (see also [35]) corresponding to the transition between random chains and good folders)^b. The most important emergent property^c associated with this symmetry-breaking, second order-like phase transition is folding (first order phase transition; coexistence of native^d (N) and denatured (D) states, see Fig. 1(e)), a phenomenon tantamount to biological activity and eventually to metabolic function and thus to the emergence of life on earth.

LES, which can be viewed as incipient, virtual secondary structures already present with varied stability in the denatured state, control not only folding (Fig. 2), but also aggregation [Tg][3T]. Because folding is much faster than collision events between solvent exposed LES of different proteins belonging to each group of proteins of a given type present in the cell (≈ 10^2), but much slower than collisions between the solvent exposed LES associated with all of the 10^6 proteins belonging to a cell [32,33, 133], one is essentially forced to assume that evolution has tooled the LES of each protein to recognize their complementary (like left and right hands) LES and essentially nothing else (LES-conjecture) (see Fig. 3). This conjecture is consistent, among other things, with the dearth of folds revealed by the proteomic project [55], a project aimed at determining the native conformations of all human proteins.

^b Second order phase transitions are, as a rule, connected with changes in symmetry. Consequently, the two phases cannot coexist (e.g. aligned (ferromagnetic) and non-aligned (paramagnetic) phases). Starting at a critical temperature, the new phase grows continuously. This is at variance with first order phase transitions (e.g. denatured → native, taking place in the case of the folding of proteins), where, in the thermodynamic limit, the two phases can coexist (discontinuous changes of the order parameter between the two phases). In particular, at the folding temperature T_f the probabilities that the system is in the native and in the denatured states are equal.
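The statement that native and denatured populations are equal at the folding temperature can be illustrated with a minimal two-state (D ⇌ N) sketch in Python. The enthalpy and entropy of folding used below are purely illustrative assumptions, not values taken from this work, and the function names are hypothetical.

```python
import math

R_GAS = 8.314  # gas constant, J/(mol K)

# Illustrative (assumed) folding enthalpy and entropy for the D -> N transition.
DELTA_H = -200e3   # J/mol
DELTA_S = -600.0   # J/(mol K)


def folding_temperature(delta_h=DELTA_H, delta_s=DELTA_S):
    """Temperature at which the folding free energy G_N - G_D vanishes."""
    return delta_h / delta_s


def native_probability(temperature_k, delta_h=DELTA_H, delta_s=DELTA_S):
    """Population of the native state in a two-state model."""
    delta_g = delta_h - temperature_k * delta_s  # G_N - G_D
    return 1.0 / (1.0 + math.exp(delta_g / (R_GAS * temperature_k)))


t_f = folding_temperature()
for t in (300.0, t_f, 360.0):
    print(f"T = {t:6.1f} K  P(native) = {native_probability(t):.4f}")
```

Below T_f the native state dominates, above it the denatured state does, and the two populations cross at exactly one half when the free energy difference vanishes.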

^c That is, properties present neither in the Hamiltonian describing the system nor in the "particles" (amino acids) forming it. In the case of the paramagnetic → ferromagnetic phase transition, emergent properties are, for example, domain walls, magnetic rigidity, etc. [291150].

^d Good folders (and consequently hot amino acids and LES) lead to native states which can be viewed as scale free networks H2UT51.



To make virtual LES become real, one can intervene in the folding process with peptides displaying a sequence identical to that of a LES of the protein under study [3"tT (Fig. 4 as well as App. D). Such peptides, called p-LES, can bind a complementary LES, leading to misfolding and thus competing with productive folding [TB1I3T1I3"5]. Circular dichroism is consistent with such a scenario [39,40], while NMR indicates that the only amino acids which give a signal close to that associated with the native state of the protein are those which bind in the native state to the LES of which the peptide p-LES is a replica [SI].

This insight concerning the validity of the LES-conjecture can be used profitably to understand aggregation, to design novel enzymes, as well as to help solve the protein folding problem: for this purpose one should design all possible LES → FN → folds and establish the connection between amino acid sequence and LES (3 steps strategy, cf. ref. [19]). Although this is a central issue in the study of proteins, arguably the most promising role of p-LES is that of being leads to non-conventional (folding) inhibitors (Fig. 5), drugs likely not to create resistance. In fact, the only way a target protein can prevent a p-LES from binding to its complementary LES is by mutating the hot amino acids of its LES. But such an event will lead to denaturation. This does not mean that a target protein cannot develop resistance. It only means that to do so a concerted mutation of a large number of amino acids has to take place in a single step, an event which is very unlikely [42]. In fact, there are no point-mutation-paths connecting the (few) possible FN of a protein (Fig. 6, see also 03]).

From this vantage point, an eventual confirmation of the validity of the LES-conjecture would imply the emergence of a new paradigm in the design of drugs and thus in the cure of diseases, in particular infectious diseases: (high mutation barrier) folding inhibition^e. Furthermore, these drugs should display few side effects, in keeping with the LES-conjecture (see also Fig. 3). In the case of the HIV-1-PR such an advantage should also carry over to a low toxicity for the proteasome.

^e "First we guess it. Then we compute the consequences of the guess to see what would be implied if the law we guess is right. Then we compare the results of the computation to nature, with experiment or experience, compare it directly with observation, to see if it works. If it disagrees with experiment it is wrong. In that simple statement is the key to science. It does not make any difference how beautiful your guess is. It does not make any difference how smart you are, who made the guess, or what your name is - if it disagrees with experiment it is wrong. That is all there is to it." R. P. Feynman

To shed light on this issue, assuming the target protein to be an enzyme like e.g. the HIV-1-Protease, folding inhibition and thus loss of activity has to be measured in vitro (purified enzyme [39]), in acute and in chronically infected cells (virus [44][45]), in vitro passage over long periods of time ([46][47]) and in living organisms, that is in test animals first and in AIDS patients later^f. The wonder of an eventual positive outcome of these tests is that of helping eradicate one by one some of the worst scourges afflicting humanity. At the end, this (if it cures it is right) would be the real outcome of the LES-conjecture, which infinitely transcends that of Feynman (if it agrees with the experiment it is, if not right, at least not wrong) in spite of its brilliancy. All the work, let alone investments, poured in the last decade into mapping out the consequences and developing embodiments of the spontaneous breaking of amino acid content symmetry will be wholly justified by the cure of even a single infected person.



Figure 6: Schematic representation of the two folding nuclei [1211 expected for the HIV-1-PR. Different colors mean different amino acid sequences.



Figure 1: Schematic representation of protein evolution starting from short (≈ 30 aa long) random polymers (a) to enzymes (homodimer) (c). Note that a similar scenario is obtained by discussing protein evolution (folding domains) in terms of single, ≈ 100 amino acid long chains. Likely, both paths were tried by nature (within this context see App. C, as well as [H]). In drawing the cartoons one had in mind the HIV-1-protease, a dimer made out of two identical chains (homodimer) each containing 99 amino acids (this is also true for the other figures; note however the variance in connection with Fig. 2).



Figure 2: Schematic representation of the role played by (weak and strong) hydrophobicity [52][53] in the folding of a protein (see also [M] and App. B). Note that the times shown can be considered typical for 100 residue long proteins [ST], but not for the HIV-1-PR monomer, which folds in times of the order of seconds ED.



Figure 3: LES-conjecture. The cell is crowded with ≈ 10^6 proteins (≈ 10^2 of each type). To be able to fold, the LES of one type of protein must recognize its complementary LES and nothing else (see App. D).

Figure 4: Schematic representation of the intervention in the folding process with peptides (p-LES) displaying the same sequence as one of the LES of the target protein.

Figure 5: Schematic representation of the link existing between LES, folding and aggregation [18], which is at the basis of (non-conventional) folding inhibitors.



^f Tests in animals (pharmacokinetics, pharmacotoxicity) and in patients (clinical phase I and phase II and eventually III) are very expensive, running into the millions of euros the first, and into tens if not hundreds of millions the second. And this kind of money can only be provided by big pharma, provided one has deposited patent requests before publishing the results of basic research ([48,49]). There are many ways one can interact with pharmaceutical firms to have such very expensive experiments carried out. A very attractive one (in particular if one has the luck to hit on a firm willing to fully support cutting edge research without strings attached) is through a University-pharmaceutical spin-off like Foldless S.r.l. [50].

A Hierarchical folding

The fact that the Local Elementary Structure (LES) scenario (conjecture), under different guises (foldons, folding initiation sites, nucleation centers, hydrophobic initiation sites, etc.), has been discovered again and again testifies to its soundness and universality. In what follows, paradigmatic examples of such almost repetitive, although in most cases independent, "eureka"-like events, spanning half a century of research (1962-2011), are briefly discussed, starting with work done in the sixties by Anfinsen and collaborators (see [STUHIinilSnilSnilSIlEZllMllMllMllSIl [65] [66] [67] [68] [69] [70] [71] [72] [73], and refs. therein).

Biological function appears to be more a correlate of macromolecular geometry than of chemical detail^g, and considerable modification of protein sequence may be made without loss of function. Mutations and natural selection are permitted with a high degree of freedom (cold sites) \TT during accidental mutation, but a limited number of residues (hot and warm sites) [TT | I18 |, destined to become involved in the internal, hydrophobic core of proteins (FN), must be carefully conserved^h or at most replaced with other residues which display a close chemical and physical similarity, let alone hydrophobicity (conservative mutations). Only the geometry of the protein and its active site need be conserved, except for such residues as actually participate in a unique way in a catalytic or regulatory mechanism.

^g Protein folding: basic physics more than detailed chemistry.

^h Note however ref. [74]. Note that, making use of the definition of the $\phi$-value [2T], namely

$$\phi = \frac{\Delta G_W^{TS-D} - \Delta G_M^{TS-D}}{\Delta G_W^{N-D} - \Delta G_M^{N-D}} ,$$

and of the fact that the LES, stabilized to various degrees by the hot residues, only partially "flicker" in and out, one would get for these residues ratios displaying essentially any value, in keeping with the fact that both numerator and denominator are small.

Because a chain of 99 amino acid residues with two rotatable bonds per residue, each bond having two or three permissible or favored orientations, would be able to assume on the order of 4^99 to 9^99 different conformations in solution, it is necessary to postulate the existence of a limited number of allowable initiating events (nucleations) in the folding process [75], essentially controlled by hydrophobic forces (see App. B). This is in keeping with the fact that in aqueous solution, ionic and hydrogen-bonded interactions would not be expected to compete effectively with interactions with solvent molecules, and anything less than a sizeable nucleus of interacting amino acid side chains ((hC), (TS), (FN), (FN)_pc)^i would likely have a very short lifetime^j. It seems reasonable to suggest that portions of a protein chain that can serve as nucleation sites for folding will be those that "flicker" in and out of the conformation they occupy in the final protein (complementary LES) and which, upon docking, will form a relatively rigid structure stabilized by a set of cooperative interactions. These nucleation centers, in what has been termed their "native format", might be expected to involve such potentially self-dependent substructures as helices, pleated sheets, or beta-bends^k.
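The size of the conformational space invoked in this argument can be checked with a few lines of Python. This is only a sketch of the counting (two rotatable bonds per residue, two or three orientations per bond, 99 residues); the sampling rate of 10^12 conformations per second is an assumption introduced here purely for illustration, and the function names are hypothetical.

```python
import math

SECONDS_PER_YEAR = 3.156e7


def conformation_count(n_residues, orientations_per_bond, bonds_per_residue=2):
    """Number of chain conformations if every rotatable bond is independent."""
    return orientations_per_bond ** (bonds_per_residue * n_residues)


def exhaustive_search_years(n_conformations, rate_per_second=1e12):
    """Time needed to visit each conformation once at an assumed sampling rate."""
    return n_conformations / rate_per_second / SECONDS_PER_YEAR


low = conformation_count(99, orientations_per_bond=2)    # 4**99
high = conformation_count(99, orientations_per_bond=3)   # 9**99

print(f"10^{math.log10(low):.1f} to 10^{math.log10(high):.1f} conformations")
print(f"lower-bound search time: 10^{math.log10(exhaustive_search_years(low)):.1f} years")
```

Even the lower bound, roughly 10^60 conformations, could not be searched exhaustively on any biological time scale, which is the reason a limited number of nucleation events has to be postulated.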

^i (hC): hydrophobic core, (TS): transition state, (FN): folding nucleus, (FN)_pc: postcritical (FN).

^j "... Furthermore, it is important to stress that the amino acid sequences of polypeptide chains designed to be the fabric of protein molecules only make functional sense when they are in the three-dimensional arrangement that characterizes them in the native protein structure" (p. 228, ref. [20], third column). In this statement one finds all of what eventually became known as the inverse folding problem (and associated solution, see refs. [11][18][19] and refs. therein) and as the Go-model [75]. On the actuality of such a statement, expressed in 1973, one can read the comment "How proteins fold" (ref. [71], published in 2011): "Non native structure has minimal influence on the (folding, rab) pathway. If non-native contacts are also insignificant, the widespread use of Go-models ... would be justified." (p. 465, third column). See also refs. [72] and [73].

^k The use of antibodies which can recognize structured segments of staphylococcal nuclease suggests that e.g. approximately 0.02 percent (2 × 10^-4) of fragment 99-149 exists in the native format at any moment. Such a value, although low, is probably very large relative to the likelihood of a peptide fragment of a protein being found in its native format on the basis of chance alone [20].

^l Because these elements (foldons [68,51], nucleating (hydrophobic) pockets [60], folding initiation sites [69], transient local structures jBU], hydrophobic folding units, partially folded kinetic intermediates, specific subdomain structures |57l [58H59 ,6061 6MMl[5T]|6B[66l[671[68]|Ml[70ir7Tl[72][73], nucleation sites [20,37,60], etc.) often are intrinsically unstable, low-energy pathways are likely to involve foldons building on top of existing structures in a process of sequential stabilization (ref. [71], p. 465, bottom second column).

The examples of noncovalent interaction of complementing fragments [20,61] referred to in connection with protein evolution give strong support to the idea that, at the basis of protein folding, there appears to exist a very fine balance between stable, native protein structure (even with low, but much larger probability than that of random sequences) and random, biologically meaningless polypeptide chains.

As an example let us refer to the fragment 1-126 of the staphylococcal nuclease molecule cited in ref. [20]. This fragment contains all of the residues that make up the active center of nuclease. Nevertheless, even if it represents about 85% (≈ 126/149) of the total sequence of the nuclease, it exhibits only about 0.12 percent of the activity of the native enzyme. The further addition of 23 residues during biosynthesis, or the addition in vitro of residues 99-149 as a complementing fragment, restores the stability required for activity to this unfinished gene translation.

The process of folding can be shown to take place in at least two phases^m: an initial rapid folding with a half-time of about 50 ms, and a second, somewhat slower transformation with a half-time of about 200 ms. The first phase is essentially temperature-independent (and therefore possibly entropically driven, WHI) and the second temperature-dependent (SHI) [51531.



B Weak and strong hydrophobic interactions

From Fig. B.1, corresponding to Fig. 2 (p. 642) of ref. [52], one can extract information concerning the Strong Hydrophobic Interaction (SHI) (see also ref. [53]), proportional to the surface ($S = 4\pi R^2$) of the hydrophobic cavity (solute)^n,

$$\Delta G_{SHI} \approx \gamma S = B S ,$$

where

$$B = \gamma \approx 7 \times 10^{-2}\ \mathrm{J/m^2}$$

is the liquid-vapour surface tension^o.



^m The first phase is essentially temperature-independent (and therefore possibly entropically driven), the second being temperature-dependent (see ref. [20], p. 228, first column and beginning of the second one). In keeping with Kramers, the transition rate from one phase to another is given by the relation $K \sim \exp(-\Delta E/T)$. In the case in which there is no barrier between the two phases, $\Delta E \to \Delta E_{eff} = -TS$ and $K \sim \exp(TS/T) = \exp(S)$. That is, the rate $K$ is temperature independent and thus entropically driven. In other words, there are no barriers to be overcome to find the new phase. The polypeptide chain undergoes a random walk in conformation space until it clicks on the conformation characteristic of such a phase. Assuming contact interactions, the system will acquire the needed stability to eventually undergo the second, enthalpically driven and thus temperature dependent phase, in which the system has to overcome a potential barrier. If, however, the force driving the system is of finite range, like e.g. the hydrophobic force, then the clicking and stabilization on the conformation characteristic of the first phase has to take place under the entropically driven WHI component of this interaction (for groups of hydrophobic residues which can fit together in a compact conformation of radius < 1 nm, it is the volume dependent, entropically controlled Weak Hydrophobic Interaction which stabilizes the system (or the different groups); see App. B, in particular Figs. B.1 and B.2).



^n Dissolving a substance in a solvent can be regarded as transforming a system from state 1 (pure solvent) to state 2 (solvent plus solute). This process is associated with a change in the free energy $\Delta G = G_2 - G_1$, a quantity which is positive for hydrophobic solutes.

^o The surface tension is a measure of the force that must be applied to surface molecules so that they experience the same force as molecules in the interior of the liquid. Surface tension exists because of attractive forces between the molecules in the bulk liquid (4 hydrogen bonds on average) and the molecules at the surface (≈ 2 hydrogen bonds on average). A molecule at the surface experiences a net inward force. This is the reason why a mosquito can walk on the water surface. Surface tension can be measured in standard experiments, by destroying part of the surface and recording how much work it takes to reconstruct it (see cartoon). At ambient conditions (room temperature and 1 atm pressure) liquid water lies close to phase coexistence with its vapour (in the figure a typical device used to learn about the interactions controlling leptodermic systems is schematically shown).

Figure B.1: Schematic representation of the numerical results reported in ref. [52] concerning the solvation free energy $\Delta G = G_2 - G_1$ for a spherical cavity in water, normalized with respect to the surface area of the circumference resulting from the intersect of the cavity with a plane containing its center, as a function of the cavity radius (room temperature and 1 atm of pressure). The liquid-vapour surface tension is denoted by $\gamma$ ($\Delta G > 0$ means that $G_2$ is less negative than $G_1$. In other words, immersing a non-polar molecule in water the free energy increases).

From Fig. B.1 one observes that the calculated value of $\Delta G/4\pi R^2$ is equal to 50 mJ/m$^2$ for $R = 0.6$ nm (1 nm = $10^{-9}$ m = 10 Å), in the region in which $\Delta G$ coincides with $\Delta G_{WHI}$. Thus

$$\frac{\Delta G_{WHI}}{4\pi R^2} = 50 \times 10^{-3}\ \mathrm{J/m^2} , \quad (R = 0.6\ \mathrm{nm}) .$$

Then

$$\frac{\Delta G_{WHI}}{V} = \frac{3}{R}\,\frac{\Delta G_{WHI}}{4\pi R^2} = \frac{3 \times 50}{0.6\ \mathrm{nm}} \times 10^{-3}\ \mathrm{J/m^2} ,$$

$$\Delta G_{WHI} = A V ,$$

where

$$A \approx 3 \times 10^{8}\ \mathrm{J/m^3} .$$

The crossing radius, that is the value of the radius of the hydrophobic particle or group of particles (see Fig. 1 of ref. [52]) which marks the transition from the volume dependent, entropic WHI regime to the surface controlled, enthalpic SHI situation (see also Fig. B.2), is determined by the relation

$$\Delta G_V(R_{cross}) = \Delta G_S(R_{cross}) ,$$

that is

$$A V = B S .$$

Making use of the relation

$$\frac{V}{S} = \frac{\frac{4\pi}{3} R^3}{4\pi R^2} = \frac{R}{3} ,$$

and

$$\frac{B}{A} = \frac{7 \times 10^{-2}\ \mathrm{J/m^2}}{3 \times 10^{8}\ \mathrm{J/m^3}} = \frac{7}{3} \times 10^{-10}\ \mathrm{m} ,$$

one obtains

$$\frac{R_{cross}}{3} = \frac{7}{3} \times 10^{-10}\ \mathrm{m} ,$$

that is,

$$R_{cross} \approx 0.7\ \mathrm{nm} .$$

In other words, the change of regime corresponds to a radius of the order of 1 nm (= 10 Å). Making use of the fact that the average Van der Waals volume of the 20 most common amino acids is ≈ 120 Å$^3$ (⇒ $R_{aa}(VW)$ ≈ 3.1 Å) and that the average range of the associated (attractive) interaction is ≈ 0.5 Å - 1 Å, one expects that the average Wigner cell radius of an amino acid is $R_{aa}$ ≈ 4.5 - 5 Å. From this estimate one obtains that the number $n$ (≈ $(R_{cross}/R_{aa})^3$) of amino acids which can fit into the largest hydrophobic cavity which, immersed in water, does not deplete hydrogen bonds (WHI) is bound between the values (10 Å/5 Å)$^3$ and (10 Å/4.5 Å)$^3$, that is

8 < n < 11 .
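The arithmetic of this appendix can be retraced with a short Python sketch. It uses only the numbers quoted in the text (ΔG/4πR² = 50 mJ/m² at R = 0.6 nm, γ = 7 × 10⁻² J/m², R_aa between 4.5 and 5 Å); the function names are hypothetical and introduced here for illustration only.

```python
import math

GAMMA = 7e-2          # B = gamma, liquid-vapour surface tension, J/m^2
DG_PER_AREA = 50e-3   # Delta G / (4 pi R^2) read from Fig. B.1, J/m^2
R_REF = 0.6e-9        # radius at which the value above is read, m


def volume_coefficient(dg_per_area=DG_PER_AREA, radius=R_REF):
    """A in Delta G_WHI = A V, using V = (R/3) * 4 pi R^2."""
    return 3.0 * dg_per_area / radius


def crossing_radius(a_coeff, b_coeff=GAMMA):
    """Radius at which A V = B S, i.e. R_cross = 3 B / A."""
    return 3.0 * b_coeff / a_coeff


def residues_in_cavity(r_cross, r_aa):
    """Number of amino acids of radius r_aa fitting in a cavity of radius r_cross."""
    return (r_cross / r_aa) ** 3


a_exact = volume_coefficient()
print(f"A = {a_exact:.2e} J/m^3 (rounded to 3e8 in the text)")
print(f"R_cross with A = 3e8: {crossing_radius(3e8) * 1e9:.2f} nm")
print(f"R_cross with unrounded A: {crossing_radius(a_exact) * 1e9:.2f} nm")
# Bounds on n for a cavity radius of 10 Angstrom, as used in the text.
print(f"n between {residues_in_cavity(10.0, 5.0):.1f} and {residues_in_cavity(10.0, 4.5):.1f}")
```

Whether A is rounded to 3 × 10⁸ J/m³ or kept at its unrounded value, the crossing radius stays below, but of the order of, 1 nm, so the estimate of n is not sensitive to this rounding.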

It is strongly suggestive that this number is similar to the number of amino acids forming the LES of typical globular proteins. This is even more so if one takes into account the fact that hydrophobicity in connection with amino acids refers principally to side chain transfer experiments (vapour → liquid water), ignoring the hydrophobic parts of the backbone.

Note that the hydrophobic (WHI) force is likely to lead to metastable (see also [79]), strongly fluctuating LES (structuring probabilities ≈ few %; within this context see ref. [20]), in keeping with their size < 1 nm, and only to sizable stabilities (SHI) of the (postcritical) FN.



C Enzymatic retention of native structural 
"memory". Protein evolution in terms of few 
"short" segments 

The 124-residue bovine pancreatic ribonuclease A (RNase A) has been extensively studied, let alone chemically synthesized. It was found, among other things, that
the C-terminal peptide 111-124, the N-terminal peptide 
1-20, and the central protein component 21-118 could be 
mixed together non-covalently and ribonuclease activity 
would be generateddEZEP. 

Other examples of retention of native structural "memory" by segments of a protein have been found with complexing fragments of the staphylococcal nuclease molecule, a calcium-dependent, RNA- and DNA-cleaving enzyme containing 149 amino acids and devoid of disulfide bridges and sulphydryl groups. The protein is digested by proteolytic enzymes. Trypsin, for example, cleaves the staphylococcal nuclease enzyme at a number of sites. The resulting fragments (residues 6 to 48) and (49 to 149) or (50 to 149) are devoid of detectable structure in solution. However, as in the case of ribonuclease S, when the fragments are mixed in stoichiometric amounts, regeneration of activity (about 10%) and of native structure characteristics occurs [20], the complex being known as nuclease T.



D Folding inhibition 

As stated in ref. [20], methods that depend on hydrodynamic or spectral measurements are not able to detect the presence of the flickering in and out of the native conformation of the nucleation sites (virtual processes). On the other hand, a method which can reveal such events, and which was employed in a study of the folding of staphylococcal nuclease and its fragments, is based on specific antibodies against restricted portions of the amino acid sequence^p.



^p Note that this idea translates the sequence based strategy developed in ref. [80] (see also [48]) from peptides to antibodies (Eugenio Cesana, private communication). Within this context cf. [81].

Figure B.2: (a) Entropic, Weak Hydrophobic Interaction (WHI). Non-polar (NP) molecules of small dimension carry around solvation shells which preserve the hydrogen bonds of bulk water, although "freezing" the solvent molecules forced to go around the NP molecule. Assembling them together frees a number of water molecules, thus increasing entropy (ΔS > 0). This mechanism, the so-called weak hydrophobic interaction, is operative for values of the radius of the conglomerate < 1 nm. (b) Enthalpic, Strong Hydrophobic Interaction (SHI). Large non-polar molecules or clusters of NP molecules (R > 1 nm), represented by extended plates, prevent solvent molecules from forming hydrogen bonds as in bulk water. The number of these molecules is drastically reduced, with a net gain of enthalpy, by overlapping the two plates. (c) Note that at biological conditions (room temperature and 1 atmosphere pressure), water is essentially in equilibrium with its vapour, the phase essentially found between plates. Fluctuations are able to expel these molecules, thus reducing the internal pressure and allowing the plates to come into contact under the pressure of the external water molecules. One can thus view the effect of SHI on NP molecules as a generalized Casimir effect [77][78].

Antibodies against specific regions of the nuclease molecule were prepared by immunization of goats either with polypeptide fragments of the enzyme or by injection of the intact, native protein, with subsequent fractionation of the resulting antibody in correspondence with the protein segments of interest.

In the former manner an antibody was prepared specifically directed against the polypeptide comprising residues 99-149, known to exist in solution as a random chain without the extensive helicity that characterizes this portion of the nuclease chain when present as part of the intact enzyme. Such an antibody preparation is referred to as anti-(99 to 149)_r, the subscript indicating the random, disordered (denatured) state of the antigen.

A similarly specific antibody for the sequence (99 to 149) was obtained, but this time by fractionation of antiserum to native nuclease. While this fraction, termed anti-(99 to 149)_n, exhibited a strong inhibitory effect on the enzymatic activity of nuclease, anti-(99 to 149)_r was devoid of such an effect. In keeping with this result, the subindex n refers to the native format of this bit of sequence.

Similar inhibitory effects, but this time making direct use of polypeptides displaying identical sequence to segments of the target protein, were found in the case of the 124-residue bovine pancreatic ribonuclease A (RNase A). The peptide His 105 - Val 124, which forms a β-pleated sheet in the native conformation of the protein, completely inhibits the refolding of this protein from the reduced, denatured state at a concentration of 10 μM and a 1:1 molar ratio of peptide to refolding protein. Complete inhibition of refolding by peptides 11-31 and 40-61 has also been observed, but this time at concentrations of 100 μM and > 100 μM respectively.

The basis for a possible explanation of these observations is the fact that, if a segment of a protein adopts a native-like conformation as an isolated peptide, it may inhibit protein refolding if this segment of the protein is involved in folding at an early stage of the refolding process. Inhibition would result from competition of the exogenous peptide with its counterpart in the protein for interaction with complementary regions of the refolding protein (cf. ref. |J2], see also [52"]).